Roll No:

|        |                           | Format No. | :ACD11A-II |
|--------|---------------------------|------------|------------|
| KLNCIT | CENTRALIZED INTERNAL TEST | Issue No.  | :01        |
|        | QUESTION                  | Rev No.    | :00        |

Subject Code/Subject Name: CS6303-Computer Architecture Year and Branch: II yr / Common to CSE & IT Date: 20.09.2017

CIT No.: II **Total marks: 50** Duration: 1 hour 30 mins

## I. Course outcomes, Question Number, Marks

| COs         | CO1 | CO2 | CO3         | CO4             | CO5 |
|-------------|-----|-----|-------------|-----------------|-----|
| Q. Nos      |     |     | 1-5,11a/11b | 6-10, 12a / 12b |     |
| Marks (Max) |     |     | 25          | 25              |     |

## II. Knowledge skill outcomes

| Level       | Remember<br>(K1) | Understand<br>(K2)                  | Apply<br>(K3) | Analysis<br>(K4) | Evaluate<br>(K5) | Create<br>(K6) |
|-------------|------------------|-------------------------------------|---------------|------------------|------------------|----------------|
| Q. Nos      | 1,3,4,6-10       | 5,<br>11(a)/11(b),<br>12a(i)/12b(i) | 2             |                  |                  |                |
| Marks (Max) | 16               | 32                                  | 2             |                  |                  |                |

PART – A $10 \times 2 = 20$ Marks

## Answer all the questions

1. What do you mean by pipeline bubble?

A stall initiated in order to resolve a hazard.

2. Assume all variables are in memory and are addressable offsets from \$t0:

lw \$t1, 0(\$t0)
lw \$t2, 4(\$t0)
add \$t3, \$t1, \$t2
sw \$t3, 12(\$t0)
lw \$t4, 8(\$t0)
add \$t5, \$t1, \$t4
sw \$t5, 16(\$t0)

Reorder the instructions to avoid any pipeline stalls.

lw \$t1, 0(\$t0)
lw \$t2, 4(\$t0)
lw \$t4, 8(\$t0)
add \$t3, \$t1, \$t2
sw \$t3, 12(\$t0)
add \$t5, \$t1, \$t4
sw \$t5, 16(\$t0)
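
The effect of the reordering can be checked with a small stall counter. This is only a sketch under stated assumptions: a classic 5-stage pipeline with full forwarding, in which the only remaining stall is a load-use hazard (an instruction that reads a register loaded by the `lw` immediately before it). The helper names `parse` and `load_use_stalls` are hypothetical and exist only for this illustration.

```python
import re

def parse(line):
    """Split one MIPS line into (opcode, destination register, source registers)."""
    op, operands = line.split(None, 1)
    regs = re.findall(r"\$\w+", operands)
    if op == "sw":
        # sw reads both the data register and the base register, writes none.
        return op, None, regs
    # lw and add: the first register is the destination, the rest are sources.
    return op, regs[0], regs[1:]

def load_use_stalls(program):
    """Count bubbles, assuming a 5-stage pipeline with full forwarding."""
    stalls = 0
    prev_op, prev_dest = None, None
    for line in program:
        op, dest, srcs = parse(line)
        # A loaded value is available only after the MEM stage, so an
        # instruction that needs it in the very next slot waits one cycle.
        if prev_op == "lw" and prev_dest in srcs:
            stalls += 1
        prev_op, prev_dest = op, dest
    return stalls

original = [
    "lw $t1, 0($t0)", "lw $t2, 4($t0)", "add $t3, $t1, $t2",
    "sw $t3, 12($t0)", "lw $t4, 8($t0)", "add $t5, $t1, $t4",
    "sw $t5, 16($t0)",
]
reordered = [
    "lw $t1, 0($t0)", "lw $t2, 4($t0)", "lw $t4, 8($t0)",
    "add $t3, $t1, $t2", "sw $t3, 12($t0)", "add $t5, $t1, $t4",
    "sw $t5, 16($t0)",
]

print(load_use_stalls(original), load_use_stalls(reordered))  # prints: 2 0
```

Under this model both `add` instructions in the original order stall once each, because each directly follows the `lw` that produces one of its operands; moving `lw $t4` up removes both bubbles.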

3. What is delayed branch?

The delayed branch always executes the next sequential instruction, with the branch taking place after that one instruction delay. It is hidden from the MIPS assembly language programmer because the assembler can automatically arrange the instructions to get the branch behavior desired by the programmer.

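4. Why is forwarding or bypassing necessary?

**Forwarding** Also called bypassing. A method of resolving a data hazard by retrieving the missing data element from internal buffers rather than waiting for it to arrive from programmer-visible registers or memory. Without it, an instruction that depends on the result of the previous instruction would have to stall until that result is written back to the register file.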

5. How is a 2-bit branch prediction scheme better than a 1-bit branch prediction scheme?

Compare your result.

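By using 2 bits rather than 1, a branch that strongly favors taken or not taken, as many branches do, will be mispredicted only once, whereas a 1-bit predictor mispredicts a loop branch twice: on the last iteration and again on the first iteration of the next execution of the loop. The 2 bits encode the four states of the predictor, and a prediction must be wrong twice before it is changed. The 2-bit scheme is a general instance of a counter-based predictor: a saturating counter is incremented when the branch is taken and decremented when it is not, and the mid-point of its range is the division between predicting taken and not taken.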
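
The comparison can be made concrete with a short simulation. The sketch below is illustrative only: the workload (a loop branch taken 9 times and then not taken, with the loop executed 3 times) and the initial predictor states are assumptions chosen for this example, not part of the question.

```python
def mispredictions_1bit(outcomes, last_taken=True):
    """1-bit scheme: predict whatever the branch did last time."""
    misses = 0
    for taken in outcomes:
        if last_taken != taken:
            misses += 1
        last_taken = taken          # a single wrong guess flips the prediction
    return misses

def mispredictions_2bit(outcomes, counter=3):
    """2-bit saturating counter: 0-1 predict not taken, 2-3 predict taken."""
    misses = 0
    for taken in outcomes:
        predicted_taken = counter >= 2
        if predicted_taken != taken:
            misses += 1
        # Move toward 3 on taken, toward 0 on not taken, saturating at the ends.
        counter = min(counter + 1, 3) if taken else max(counter - 1, 0)
    return misses

# Assumed workload: loop branch taken 9 times, then falls through; loop run 3 times.
outcomes = ([True] * 9 + [False]) * 3

print(mispredictions_1bit(outcomes))  # prints: 5
print(mispredictions_2bit(outcomes))  # prints: 3
```

The 1-bit predictor misses at every loop exit and again at every re-entry after the first, while the 2-bit counter misses only at the exits, because one not-taken outcome moves it from 3 to 2 and the prediction stays "taken".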

Knowledge levels, questions 1 to 5: Q1 (K1), Q2 (K3), Q3 (K1), Q4 (K1), Q5 (K2)

6. What are the primary methods for increasing the instruction level parallelism (ILP)?

First is increasing the depth of the pipeline to overlap more instructions.

Another approach is to replicate the internal components of the computer so that it can launch multiple instructions in every pipeline stage.

7. Define speculation.

**Speculation** An approach whereby the compiler or processor guesses the outcome of an instruction to remove it as a dependence in executing other instructions.

8. What is Very Long Instruction Word (VLIW)?

**Very Long Instruction Word (VLIW)** A style of instruction set architecture that launches many operations that are defined to be independent in a single wide instruction, typically with many separate opcode fields.

9. What are NUMA and SMP?

A shared memory multiprocessor (SMP) is one that offers the programmer a single physical address space across all processors, which is nearly always the case for multicore chips although a more accurate term would have been shared-address multiprocessor.

**Nonuniform memory access (NUMA)** A type of single address space multiprocessor in which some memory accesses are much faster than others depending on which processor asks for which word.

10. Write briefly about multithreading.

**Hardware multithreading** Increasing utilization of a processor by switching to another thread when one thread is stalled.

**Fine-grained multithreading** A version of hardware multithreading that implies switching between threads after every instruction.

**Coarse-grained multithreading** A version of hardware multithreading that implies switching between threads only after significant events, such as a last-level cache miss.
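
The difference between the two switching policies can be sketched as an issue-order model. This is a deliberately simplified illustration that ignores cycle timing: the thread contents are made up, and the convention that a trailing `*` marks an instruction causing a last-level cache miss is an assumption of this example only.

```python
def fine_grained(threads):
    """Round-robin: switch to the next thread after every instruction."""
    queues = [list(t) for t in threads]
    order = []
    while any(queues):
        for queue in queues:
            if queue:
                order.append(queue.pop(0))
    return order

def coarse_grained(threads, is_costly_stall):
    """Stay on one thread; switch only after an instruction with a costly stall."""
    queues = [list(t) for t in threads]
    order, current = [], 0
    while any(queues):
        if not queues[current]:
            # Current thread has finished, move on to the next one.
            current = (current + 1) % len(queues)
            continue
        instr = queues[current].pop(0)
        order.append(instr)
        if is_costly_stall(instr):
            current = (current + 1) % len(queues)
    return order

# Hypothetical threads; "*" marks an instruction that misses in the last-level cache.
thread_a = ["A1", "A2*", "A3"]
thread_b = ["B1", "B2", "B3*"]

print(fine_grained([thread_a, thread_b]))
# prints: ['A1', 'B1', 'A2*', 'B2', 'A3', 'B3*']
print(coarse_grained([thread_a, thread_b], lambda i: i.endswith("*")))
# prints: ['A1', 'A2*', 'B1', 'B2', 'B3*', 'A3']
```

A real fine-grained design would also skip threads that are currently stalled; that detail is left out here to keep the contrast between the two policies visible.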





(**OR**)

(b) Explain in detail about how exceptions are handled in MIPS architecture.

| Type of event                                 | From where? | MIPS terminology       |
|-----------------------------------------------|-------------|------------------------|
| I/O device request                            | External    | Interrupt              |
| Invoke the operating system from user program | Internal    | Exception              |
| Arithmetic overflow                           | Internal    | Exception              |
| Using an undefined instruction                | Internal    | Exception              |
| Hardware malfunctions                         | Either      | Exception or interrupt |

(K2)
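
The basic actions taken on an exception (save the address of the offending instruction in EPC, record the cause, transfer control to the handler) can be sketched as a toy model. This is an assumption-laden illustration, not the MIPS implementation: the `Cpu` class and its fields are hypothetical, and the two vector addresses are the ones listed in this answer's exception vector table.

```python
# Vector addresses as listed in the answer's exception vector table.
VECTOR = {
    "undefined_instruction": 0x8000_0000,
    "arithmetic_overflow": 0x8000_0180,
}

class Cpu:
    """Toy model holding only the state touched when an exception is taken."""

    def __init__(self, pc):
        self.pc = pc
        self.epc = None    # exception program counter
        self.cause = None  # hypothetical stand-in for the Cause register

    def raise_exception(self, kind):
        self.epc = self.pc       # remember where the offending instruction was
        self.cause = kind        # record why the exception happened
        self.pc = VECTOR[kind]   # vectored dispatch: jump to the handler address

cpu = Cpu(pc=0x0040_0040)        # assumed address of an overflowing add
cpu.raise_exception("arithmetic_overflow")
print(hex(cpu.epc), hex(cpu.pc))  # prints: 0x400040 0x80000180
```

With a single entry point instead of vectored interrupts, the jump target would be the same for every exception and the handler would inspect the recorded cause to decide what to do.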

PART – B $2 \times 15 = 30$ Marks

Knowledge levels: Q6 to Q10 (K1), Q11(a)/11(b) (K2)

| Exception type        | Exception vector address (in hex) |
|-----------------------|-----------------------------------|
| Undefined instruction | 8000 0000<sub>hex</sub>           |
| Arithmetic overflow   | 8000 0180<sub>hex</sub>           |

Status Register

EPC

Vectored Interrupt

12. (a) Explain in detail about Flynn's classification. (K2)

SISD

SIMD

MISD

MIMD

(**OR**)

(b) Explain instruction level parallelism. State the challenges of parallel processing. (K2)

Static multiple issue

Dynamic multiple issue

Issue slots

Speculation

Issue packet

VLIW

Example

Challenges:

The first reason is that you *must* get better performance or better energy efficiency from a parallel processing program on a multiprocessor.

Speed-up challenge

Strong scaling

Weak scaling
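
The speed-up challenge is usually quantified with Amdahl's law. The sketch below is a worked illustration under assumed numbers (a target speed-up of 90 on 100 processors, with a fixed problem size, which is the strong-scaling setting); the function names are hypothetical.

```python
def speedup(parallel_fraction, processors):
    """Amdahl's law: speed-up when only part of the work can be parallelized."""
    sequential_fraction = 1.0 - parallel_fraction
    return 1.0 / (sequential_fraction + parallel_fraction / processors)

def max_sequential_fraction(target_speedup, processors):
    """Largest sequential fraction s that still reaches the target speed-up.

    From target = 1 / (s + (1 - s) / p) it follows that
    s = (p / target - 1) / (p - 1).
    """
    return (processors / target_speedup - 1.0) / (processors - 1.0)

# Assumed example: reach a speed-up of 90 using 100 processors.
s = max_sequential_fraction(90, 100)
print(f"{s:.4%}")                         # prints: 0.1122%
print(round(speedup(1.0 - s, 100), 6))    # prints: 90.0
```

Only about 0.1% of the original computation may remain sequential, which is why good speed-up is hard to obtain when the problem size is fixed; under weak scaling the problem size grows with the number of processors, so the parallel part grows as well.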

\*\*\*\*\*